26 research outputs found

    apk2vec: Semi-supervised multi-view representation learning for profiling Android applications

    Building behavior profiles of Android applications (apps) with holistic, rich and multi-view information (e.g., incorporating several semantic views of an app such as API sequences, system calls, etc.) would help serve downstream analytics tasks such as app categorization, recommendation and malware analysis significantly better. Towards this goal, we design a semi-supervised Representation Learning (RL) framework named apk2vec to automatically generate a compact representation (aka profile/embedding) for a given app. More specifically, apk2vec has the following three unique characteristics which make it an excellent choice for large-scale app profiling: (1) it encompasses information from multiple semantic views such as API sequences, permissions, etc., (2) being a semi-supervised embedding technique, it can make use of labels associated with apps (e.g., malware family or app category labels) to build high-quality app profiles, and (3) it combines RL and feature hashing, which allows it to efficiently build profiles of apps that stream over time (i.e., online learning). The resulting semi-supervised multi-view hash embeddings of apps could then be used for a wide variety of downstream tasks such as the ones mentioned above. Our extensive evaluations with more than 42,000 apps demonstrate that apk2vec's app profiles could significantly outperform state-of-the-art techniques in four app analytics tasks, namely malware detection, familial clustering, app clone detection and app recommendation. Comment: International Conference on Data Mining, 201

    LA-HCN: Label-based Attention for Hierarchical Multi-label Text Classification Neural Network

    Hierarchical multi-label text classification (HMTC) has been gaining popularity in recent years thanks to its applicability to a plethora of real-world applications. The existing HMTC algorithms largely focus on the design of classifiers, such as local classifiers, global classifiers, or a combination of the two. However, very few studies have focused on hierarchical feature extraction or explored the association between the hierarchical labels and the text. In this paper, we propose a Label-based Attention for Hierarchical Multi-label Text Classification Neural Network (LA-HCN), where the novel label-based attention module is designed to hierarchically extract important information from the text based on the labels from different hierarchy levels. In addition, hierarchical information is shared across levels while preserving the hierarchical label-based information. Separate local and global document embeddings are obtained and used to facilitate the respective local and global classifications. In our experiments, LA-HCN outperforms other state-of-the-art neural network-based HMTC algorithms on four public HMTC datasets. The ablation study also demonstrates the effectiveness of the proposed label-based attention module as well as the novel local and global embeddings and classifications. By visualizing the learned attention (words), we find that LA-HCN is able to extract meaningful information corresponding to the different labels, which provides explainability that may be helpful for the human analyst. Comment: code is available at https://github.com/XinyiZ001/LA-HC

    Optimization and clinical validation of a pathogen detection microarray

    New design and optimization of pathogen detection microarrays is shown to allow robust and accurate detection of a range of pathogens. The customized microarray platform includes a method for reducing PCR bias during DNA amplification

    LibID: Reliable identification of obfuscated third-party Android libraries

    Third-party libraries are vital components of Android apps, yet they can also introduce serious security threats and impede the accuracy and reliability of app analysis tasks, such as app clone detection. Several library detection approaches have been proposed to address these problems. However, we show these techniques are not robust against popular code obfuscators, such as ProGuard, which is now used in nearly half of all apps. We then present LibID, a library detection tool that is more resilient to code shrinking and package modification than state-of-the-art tools. We show that the library identification problem can be formulated using binary integer programming models. LibID is able to identify specific versions of third-party libraries in candidate apps through static analysis of app binaries coupled with a database of third-party libraries. We propose a novel approach to generate synthetic apps to tune the detection thresholds. Then, we use F-Droid apps as the ground truth to evaluate LibID under different obfuscation settings, which shows that LibID is more robust to code obfuscators than state-of-the-art tools. Finally, we demonstrate the utility of LibID by detecting the use of a vulnerable version of the OkHttp library in nearly 10% of the 3,958 most popular apps on the Google Play Store. The Boeing Company, China Scholarship Council, Microsoft Research

    Search for single production of vector-like quarks decaying into Wb in pp collisions at $\sqrt{s} = 8$ TeV with the ATLAS detector

    Measurement of the charge asymmetry in top-quark pair production in the lepton-plus-jets final state in pp collision data at $\sqrt{s} = 8$ TeV with the ATLAS detector

    ATLAS Run 1 searches for direct pair production of third-generation squarks at the Large Hadron Collider

    A low power portable bioelectronics holter monitor for arrhythmias detection - B

    A standard electrocardiogram (ECG) records the patient's heart electrical activity only for a short duration, and patients are discouraged from moving during the recording. However, an abnormality may not occur during this period. Therefore, there is a need for a portable ECG device, also known as a Holter monitor, which provides good accuracy and comfort and records the ECG while the patient carries out daily activities. This report documents the research and design of a low-power Holter monitor for the detection of arrhythmias. The first part of the report covers a literature review and research on heart electrical activity and the different types of arrhythmias. In addition, research on the interference that may affect the recording of an ambulatory electrocardiogram is presented. The next part of the report describes the methodology for designing and constructing the hardware. The hardware design uses low-power components and incorporates different noise cancellation techniques. The software design optimizes battery life with the help of a microchip, analyzes the detected ECG signal and decides whether to activate the wireless module. Finally, the last part of the report presents the conclusion and future recommendations for this project. Bachelor of Engineering

    Program analysis and machine learning techniques for mobile security

    Over the past few years, concerns have been raised about the increasing number of malicious and clone apps infiltrating the Android markets. Android malware may perform a range of malicious activities (e.g., exfiltrating sensitive information and sending premium SMS), and clone apps steal revenue from the original developer. The detection of these adversary apps is non-trivial, as an in-depth understanding of the apps is required. Furthermore, due to the arms race between the adversary apps and the detection algorithms, the adversary apps are constantly evolving and becoming more sophisticated. Hence, new and more effective algorithms are imperative. This thesis proposes three methods and one empirical study with suggested solutions for Android app analysis. We address four specific issues that plague Android security, namely clone detection, third-party library detection, malware detection and concept drift. We do so by leveraging program analysis, Machine Learning and Deep Learning techniques. Doctor of Philosophy